Abstract. This is a survey of recent progress and open questions on the structure
of the sets of 0-1 and non-negative integer matrices with prescribed row and
column sums. We discuss cardinality estimates, the structure of a random matrix 
from the set, discrete versions of the Brunn-Minkowski inequality and the statistical 
dependence between row and column sums. 



1. Introduction 

Let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be positive integer vectors such that

(1.1) $$r_1 + \ldots + r_m = c_1 + \ldots + c_n = N.$$

We consider the set $A_0(R, C)$ of all $m \times n$ matrices $D = (d_{ij})$ with 0-1 entries, row
sums $R$ and column sums $C$:

$$A_0(R, C) = \Bigl\{ D = (d_{ij}) :\ \sum_{j=1}^{n} d_{ij} = r_i \ \text{for } i = 1, \ldots, m, \quad \sum_{i=1}^{m} d_{ij} = c_j \ \text{for } j = 1, \ldots, n, \quad d_{ij} \in \{0, 1\} \Bigr\}.$$



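We also consider the set $A_+(R, C)$ of non-negative integer $m \times n$ matrices with
row sums $R$ and column sums $C$:

$$A_+(R, C) = \Bigl\{ D = (d_{ij}) :\ \sum_{j=1}^{n} d_{ij} = r_i \ \text{for } i = 1, \ldots, m, \quad \sum_{i=1}^{m} d_{ij} = c_j \ \text{for } j = 1, \ldots, n, \quad d_{ij} \in \mathbb{Z}_+ \Bigr\}.$$

The vectors $R$ and $C$ are called the margins of matrices from $A_0(R, C)$ and $A_+(R, C)$.
We reserve the notation $N$ for the sum of the coordinates of $R$ and of $C$ in (1.1) and
write $|R| = |C| = N$.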

While the set $A_+(R, C)$ is non-empty as long as the balance condition (1.1)
is satisfied, a result of Gale and Ryser (see, for example, Section 6.2 of [BR91])
provides a necessary and sufficient criterion for the set $A_0(R, C)$ to be non-empty. Let
us assume that

$$m \ge c_1 \ge c_2 \ge \ldots \ge c_n > 0 \quad \text{and that} \quad n \ge r_i > 0 \ \text{for } i = 1, \ldots, m.$$

The set $A_0(R, C)$ is non-empty if and only if (1.1) holds and

$$\sum_{i=1}^{m} \min\{r_i, k\} \ \ge \ \sum_{j=1}^{k} c_j \quad \text{for } k = 1, \ldots, n.$$
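
The Gale-Ryser criterion is easy to test mechanically. The following Python sketch is an illustration and not part of the original exposition: the function names `gale_ryser_nonempty` and `count_01_brute_force` are hypothetical, and the brute-force counter is only meant for very small margins, where it can be used to cross-check the criterion.

```python
from itertools import combinations, product


def gale_ryser_nonempty(R, C):
    """Return True if a 0-1 matrix with row sums R and column sums C exists."""
    if sum(R) != sum(C):                    # balance condition (1.1)
        return False
    c_sorted = sorted(C, reverse=True)      # the criterion assumes c_1 >= ... >= c_n
    partial = 0
    for k, c in enumerate(c_sorted, start=1):
        partial += c                        # c_1 + ... + c_k
        if sum(min(r, k) for r in R) < partial:
            return False
    return True


def count_01_brute_force(R, C):
    """Count 0-1 matrices with margins (R, C) by enumerating rows (tiny cases only)."""
    n = len(C)
    # For each prescribed row sum r, list all 0-1 rows of length n with r ones.
    row_choices = [
        [tuple(1 if j in ones else 0 for j in range(n))
         for ones in combinations(range(n), r)]
        for r in R
    ]
    count = 0
    for rows in product(*row_choices):
        # zip(*rows) iterates over the columns of the candidate matrix
        if all(sum(col) == c for col, c in zip(zip(*rows), C)):
            count += 1
    return count
```

For instance, for $R = (3, 3, 1, 1)$ and $C = (4, 3, 1)$ the inequality fails at $k = 2$, and the brute-force count confirms that no such matrix exists.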

Assuming that $A_0(R, C) \ne \emptyset$, we are interested in the following questions:

• What is the cardinality $|A_0(R, C)|$ of $A_0(R, C)$ and the cardinality $|A_+(R, C)|$
of $A_+(R, C)$?

• Let us consider $A_0(R, C)$ and $A_+(R, C)$ as finite probability spaces with
the uniform measure. What are a random matrix $D \in A_0(R, C)$ and a random matrix
$D \in A_+(R, C)$ likely to look like?

The paper is organized as follows. 

In Section 2 we estimate $|A_0(R, C)|$ within an $(mn)^{O(m+n)}$ factor and in Section 3
we estimate $|A_+(R, C)|$ within an $N^{O(m+n)}$ factor. In all but very sparse cases
we obtain in this way asymptotically exact estimates of $\ln |A_0(R, C)|$ and $\ln |A_+(R, C)|$
respectively. The estimate of Section 2 is based on a representation of $|A_0(R, C)|$
as the permanent of a certain $mn \times mn$ matrix of 0's and 1's, while the estimate
of Section 3 is based on a representation of $|A_+(R, C)|$ as the expectation of the
permanent of a certain $N \times N$ random matrix with exponentially distributed entries.
In the proofs, the crucial role is played by the van der Waerden inequality for
permanents of doubly stochastic matrices. The cardinality estimates are obtained
as solutions to simple convex optimization problems and hence are efficiently
computable, although they cannot be expressed by a "closed formula" in the margins
$(R, C)$. Our method is robust: the same approach can be applied to
estimate the cardinality of the set of matrices with prescribed margins and with 0's
in prescribed positions.

In Sections 4 and 5 we discuss some consequences of the formulas obtained in
Sections 2 and 3. In particular, in Section 4, we show that the numbers $|A_0(R, C)|$
and $|A_+(R, C)|$ are both approximately log-concave as functions of the margins
$(R, C)$. We note an open question whether these numbers are genuinely log-concave
and give some, admittedly weak, evidence that it may be the case. In Section 5, we
discuss statistical dependence between row and column sums. Namely, we consider
finite probability spaces of $m \times n$ non-negative integer or 0-1 matrices with the
total sum $N$ of entries and two events in those spaces: the event $\mathcal{R}$ consisting of the
matrices with row sums $R$ and the event $\mathcal{C}$ consisting of the matrices with column sums
$C$. It turns out that 0-1 and non-negative integer matrices exhibit opposite types of
behavior. Assuming that the margins $R$ and $C$ are sufficiently far away from sparse
and uniform, we show that for 0-1 matrices the events $\mathcal{R}$ and $\mathcal{C}$ repel each other
(the events are negatively correlated) while for non-negative integer matrices
they attract each other (the events are positively correlated).

In Section 6, we discuss what random matrices $D \in A_0(R, C)$ and $D \in A_+(R, C)$
look like. We show that in many respects, a random matrix $D \in A_0(R, C)$ behaves
like an $m \times n$ matrix $X$ of independent Bernoulli random variables such that $\mathbf{E}\, X = Z_0$,
where $Z_0$ is a certain matrix, called the maximum entropy matrix, with row sums
$R$, column sums $C$ and entries between 0 and 1. It turns out that $Z_0$ is the solution
to an optimization problem, which is convex dual to the optimization problem
of Section 2 used to estimate $|A_0(R, C)|$. On the other hand, a random matrix
$D \in A_+(R, C)$ in many respects behaves like an $m \times n$ matrix $X$ of independent
geometric random variables such that $\mathbf{E}\, X = Z_+$, where $Z_+$ is a certain matrix, also
called the maximum entropy matrix, with row sums $R$, column sums $C$ and
non-negative entries. It turns out that $Z_+$ is the solution to an optimization problem
which is convex dual to the optimization problem of Section 3 used to estimate
$|A_+(R, C)|$. It follows that in various natural metrics matrices $D \in A_0(R, C)$
concentrate about $Z_0$ while matrices $D \in A_+(R, C)$ concentrate about $Z_+$. We
note some open questions on whether individual entries of random $D \in A_0(R, C)$
and random $D \in A_+(R, C)$ are asymptotically Bernoulli, respectively geometric,
with the expectations read off from $Z_0$ and $Z_+$.

In Section 7, we discuss asymptotically exact formulas for $|A_0(R, C)|$ and
$|A_+(R, C)|$. Those formulas are established under essentially more restrictive
conditions than the cruder estimates of Sections 2 and 3. We assume that the entries of
the maximum entropy matrices $Z_0$ and $Z_+$ are within a constant factor, fixed in
advance, of each other. Recall that the matrices $Z_0$ and $Z_+$ characterize the typical
behavior of random matrices $D \in A_0(R, C)$ and $D \in A_+(R, C)$ respectively. In
the case of 0-1 matrices our condition basically means that the margins $(R, C)$ lie
sufficiently deep inside the region defined by the Gale-Ryser inequalities. As the
margins approach the boundary, the number $|A_0(R, C)|$ gets volatile and hence
cannot be expressed by an analytic formula like the one described in Section 7. The
situation with non-negative integer matrices is less clear. It is plausible that the
number $|A_+(R, C)|$ experiences some volatility when some entries of $Z_+$ become
abnormally large, but we don't have a proof of that happening.

In Section 8, we mention some possible ramifications, such as enumeration of 
higher-order tensors and graphs with given degree sequences. 

The paper is a survey and although we don't provide complete proofs, we often 
sketch main ideas of our approach. 

2. The logarithmic asymptotic for the number of 0-1 matrices 
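The following result is proven in [Ba10a].

(2.1) Theorem. Given positive integer vectors

$$R = (r_1, \ldots, r_m) \quad \text{and} \quad C = (c_1, \ldots, c_n),$$

let us define the function

$$F_0(\mathbf{x}, \mathbf{y}) = \Bigl( \prod_{i=1}^{m} x_i^{-r_i} \Bigr) \Bigl( \prod_{j=1}^{n} y_j^{-c_j} \Bigr) \Bigl( \prod_{i,j} (1 + x_i y_j) \Bigr)$$

for $\mathbf{x} = (x_1, \ldots, x_m)$ and $\mathbf{y} = (y_1, \ldots, y_n)$

and let

$$\alpha_0(R, C) = \inf_{\substack{x_1, \ldots, x_m > 0 \\ y_1, \ldots, y_n > 0}} F_0(\mathbf{x}, \mathbf{y}).$$

Then the number $|A_0(R, C)|$ of $m \times n$ zero-one matrices with row sums $R$ and column
sums $C$ satisfies

$$\alpha_0(R, C) \ \ge \ |A_0(R, C)| \ \ge \ \frac{(mn)!}{(mn)^{mn}} \Bigl( \prod_{i=1}^{m} \frac{(n - r_i)^{n - r_i}}{(n - r_i)!} \Bigr) \Bigl( \prod_{j=1}^{n} \frac{c_j^{c_j}}{c_j!} \Bigr) \alpha_0(R, C).$$

Using Stirling's formula,

$$\frac{s!}{s^s} = \sqrt{2 \pi s}\, e^{-s} \bigl( 1 + O(1/s) \bigr),$$

one can notice that the ratio between the upper and lower bounds is $(mn)^{O(m+n)}$.
Indeed, the "$e^{-s}$" terms cancel each other out, since

$$e^{-mn} \Bigl( \prod_{i=1}^{m} e^{n - r_i} \Bigr) \Bigl( \prod_{j=1}^{n} e^{c_j} \Bigr) = 1.$$

Thus, for sufficiently dense 0-1 matrices, where we have $|A_0(R, C)| = 2^{\Omega(mn)}$, we
have an asymptotically exact formula

$$\ln |A_0(R, C)| \approx \ln \alpha_0(R, C) \quad \text{as } m, n \to +\infty.$$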

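(2.2) A convex version of the optimization problem. Let us substitute

$$x_i = e^{s_i} \ \text{for } i = 1, \ldots, m \quad \text{and} \quad y_j = e^{t_j} \ \text{for } j = 1, \ldots, n$$

in $F_0(\mathbf{x}, \mathbf{y})$. Denoting

(2.2.1) $$G_0(\mathbf{s}, \mathbf{t}) = -\sum_{i=1}^{m} r_i s_i - \sum_{j=1}^{n} c_j t_j + \sum_{i,j} \ln \bigl( 1 + e^{s_i + t_j} \bigr)$$

for $\mathbf{s} = (s_1, \ldots, s_m)$ and $\mathbf{t} = (t_1, \ldots, t_n)$,

we obtain

$$\ln \alpha_0(R, C) = \inf_{\substack{s_1, \ldots, s_m \\ t_1, \ldots, t_n}} G_0(\mathbf{s}, \mathbf{t}).$$

We observe that $G_0(\mathbf{s}, \mathbf{t})$ is a convex function on $\mathbb{R}^{m+n}$. In particular, one can
compute the infimum of $G_0$ efficiently by using interior point methods, see, for
example, [NN94].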
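
For small margins, the convex problem can also be explored numerically without an interior point solver. The sketch below is only an illustration, not the method of [NN94]: it runs plain gradient descent on the function $-\sum r_i s_i - \sum c_j t_j + \sum \ln(1 + e^{s_i + t_j})$, and it assumes that the infimum is attained (otherwise the iterates drift to infinity). The function name `minimize_G0` and the iteration count are hypothetical choices.

```python
import math


def minimize_G0(R, C, iterations=5000):
    """Approximate ln(alpha_0(R, C)) by gradient descent on the convex function G_0."""
    m, n = len(R), len(C)
    s = [0.0] * m
    t = [0.0] * n
    # Each term ln(1 + e^u) has second derivative at most 1/4, which bounds the
    # largest Hessian eigenvalue of G_0 by max(m, n) / 2; with this step size
    # gradient descent never increases G_0.
    step = 2.0 / max(m, n)

    def sigmoid(u):
        return 1.0 / (1.0 + math.exp(-u))

    for _ in range(iterations):
        # p[i][j] = x_i y_j / (1 + x_i y_j) with x_i = e^{s_i}, y_j = e^{t_j}
        p = [[sigmoid(s[i] + t[j]) for j in range(n)] for i in range(m)]
        grad_s = [sum(p[i]) - R[i] for i in range(m)]
        grad_t = [sum(p[i][j] for i in range(m)) - C[j] for j in range(n)]
        s = [s[i] - step * grad_s[i] for i in range(m)]
        t = [t[j] - step * grad_t[j] for j in range(n)]

    value = (-sum(r * si for r, si in zip(R, s))
             - sum(c * tj for c, tj in zip(C, t))
             + sum(math.log1p(math.exp(s[i] + t[j]))
                   for i in range(m) for j in range(n)))
    return value, s, t
```

For $R = (2, 1)$ and $C = (1, 1, 1)$ the minimum can be found by hand at $s_1 + t_j = \ln 2$, $s_2 + t_j = -\ln 2$, with value $6 \ln 3 - 4 \ln 2$, so the upper bound is about $45.6$ while the true count is $3$; on such tiny examples the polynomial factor dominates.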

(2.3) Sketch of proof of Theorem 2.1. The upper bound for $|A_0(R, C)|$ is
immediate: it follows from the expansion

$$\prod_{i,j} (1 + x_i y_j) = \sum_{R, C} |A_0(R, C)|\, \mathbf{x}^R \mathbf{y}^C,$$

where

$$\mathbf{x}^R = x_1^{r_1} \cdots x_m^{r_m} \quad \text{and} \quad \mathbf{y}^C = y_1^{c_1} \cdots y_n^{c_n}$$

for $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ and the sum is taken over all pairs
of non-negative integer vectors $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ such that
$r_1 + \ldots + r_m = c_1 + \ldots + c_n \le mn$. Since all terms of the sum are non-negative
for positive $\mathbf{x}$ and $\mathbf{y}$, each individual term is at most the whole product.

To prove the lower bound, we express $|A_0(R, C)|$ as the permanent of an $mn \times mn$
matrix. Recall that the permanent of a $k \times k$ matrix $B = (b_{ij})$ is defined by

$$\operatorname{per} B = \sum_{\sigma \in S_k} \prod_{i=1}^{k} b_{i \sigma(i)},$$

where the sum is taken over the symmetric group $S_k$ of all permutations $\sigma$ of the
set $\{1, \ldots, k\}$, see, for example, Chapter 11 of [LW01]. One can show, see [Ba10a]
for details, that

(2.3.1) $$|A_0(R, C)| = \Bigl( \prod_{i=1}^{m} \frac{1}{(n - r_i)!} \Bigr) \Bigl( \prod_{j=1}^{n} \frac{1}{c_j!} \Bigr) \operatorname{per} B,$$

where $B$ is the $mn \times mn$ matrix of the following structure:

the rows of $B$ are split into $m + n$ distinct blocks, the $m$ blocks of type I having
$n - r_1, \ldots, n - r_m$ rows respectively and the $n$ blocks of type II having $c_1, \ldots, c_n$ rows
respectively;

the columns of $B$ are split into $m$ distinct blocks of $n$ columns each;

for $i = 1, \ldots, m$, the entry of $B$ that lies in a row from the $i$-th block of rows of
type I and a column from the $i$-th block of columns is equal to 1;

for $i = 1, \ldots, m$ and $j = 1, \ldots, n$, the entry of $B$ that lies in a row from the
$j$-th block of rows of type II and the $j$-th column from the $i$-th block of columns is
equal to 1;

all other entries of $B$ are 0.
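
The block structure of $B$ can be checked on small examples. The Python sketch below is an illustration with hypothetical helper names (`build_B`, `permanent`); it assembles $B$ from the description above and computes its permanent by dynamic programming over subsets of columns. It relies on the counting argument that every matrix in $A_0(R, C)$ corresponds to exactly $\prod_i (n - r_i)! \prod_j c_j!$ permutations contributing to $\operatorname{per} B$, since rows inside a block are identical.

```python
from math import factorial


def build_B(R, C):
    """Assemble the mn x mn 0-1 matrix B from its block description."""
    m, n = len(R), len(C)
    rows = []
    # Column index col = i * n + j is the j-th column of the i-th column block.
    for i in range(m):                      # type I blocks: n - r_i identical rows
        row = [1 if col // n == i else 0 for col in range(m * n)]
        rows.extend([row] * (n - R[i]))
    for j in range(n):                      # type II blocks: c_j identical rows
        row = [1 if col % n == j else 0 for col in range(m * n)]
        rows.extend([row] * C[j])
    return rows


def permanent(M):
    """Permanent via dynamic programming over bitmasks of used columns."""
    k = len(M)
    ways = {0: 1}                           # used-column mask -> accumulated weight
    for row in M:
        nxt = {}
        for mask, w in ways.items():
            for col in range(k):
                if row[col] and not mask & (1 << col):
                    key = mask | (1 << col)
                    nxt[key] = nxt.get(key, 0) + w * row[col]
        ways = nxt
    return ways.get((1 << k) - 1, 0)
```

For $R = (2, 1)$, $C = (1, 1, 1)$ there are 3 matrices in $A_0(R, C)$ and the factor is $1! \cdot 2! = 2$, so the permanent of the $6 \times 6$ matrix should equal 6.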

Suppose that the infimum of the function $G_0(\mathbf{s}, \mathbf{t})$ defined by (2.2.1) is attained at a
particular point $\mathbf{s} = (s_1, \ldots, s_m)$ and $\mathbf{t} = (t_1, \ldots, t_n)$ (the case when the infimum
is not attained is handled by an approximation argument). Let $x_i = \exp\{s_i\}$ for
$i = 1, \ldots, m$ and $y_j = \exp\{t_j\}$ for $j = 1, \ldots, n$.

Setting the gradient of $G_0(\mathbf{s}, \mathbf{t})$ to 0, we obtain

(2.3.2) $$\sum_{j=1}^{n} \frac{x_i y_j}{1 + x_i y_j} = r_i \ \text{for } i = 1, \ldots, m \quad \text{and} \quad \sum_{i=1}^{m} \frac{x_i y_j}{1 + x_i y_j} = c_j \ \text{for } j = 1, \ldots, n.$$

Let us consider a matrix $B'$ obtained from the matrix $B$ as follows:

for $i = 1, \ldots, m$ we multiply every row of $B$ in the $i$-th block of type I by

$$\frac{1}{x_i (n - r_i)};$$

for $j = 1, \ldots, n$, we multiply every row of $B$ in the $j$-th block of type II by

$$\frac{y_j}{c_j};$$

for $i = 1, \ldots, m$ and $j = 1, \ldots, n$ we multiply the $j$-th column in the $i$-th block
of columns of $B$ by

$$\frac{x_i}{1 + x_i y_j}.$$

Then

$$\operatorname{per} B = \Bigl( \prod_{i=1}^{m} x_i^{n - r_i} (n - r_i)^{n - r_i} \Bigr) \Bigl( \prod_{j=1}^{n} \frac{c_j^{c_j}}{y_j^{c_j}} \Bigr) \Bigl( \prod_{i,j} \frac{1 + x_i y_j}{x_i} \Bigr) \operatorname{per} B'.$$

On the other hand, equations (2.3.2) imply that the row and column sums of $B'$ are
equal to 1, that is, $B'$ is doubly stochastic. Applying the van der Waerden bound for
permanents of doubly stochastic matrices, see, for example, Chapter 12 of [LW01],
we conclude that

$$\operatorname{per} B' \ \ge \ \frac{(mn)!}{(mn)^{mn}},$$

which, together with (2.3.1), completes the proof. □
One can prove a version of Theorem 2.1 for 0-1 matrices with prescribed row 
and column sums and prescribed zeros in some positions. 

3. The logarithmic asymptotics for the 
number of non-negative integer matrices 

The following result is proven in [Ba09].

(3.1) Theorem. Let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be positive integer
vectors such that $r_1 + \ldots + r_m = c_1 + \ldots + c_n = N$. Let us define a function

$$F_+(\mathbf{x}, \mathbf{y}) = \Bigl( \prod_{i=1}^{m} x_i^{-r_i} \Bigr) \Bigl( \prod_{j=1}^{n} y_j^{-c_j} \Bigr) \Bigl( \prod_{i,j} \frac{1}{1 - x_i y_j} \Bigr)$$

for $\mathbf{x} = (x_1, \ldots, x_m)$ and $\mathbf{y} = (y_1, \ldots, y_n)$.
Then $F_+(\mathbf{x}, \mathbf{y})$ attains its minimum

$$\alpha_+(R, C) = \min_{\substack{0 < x_1, \ldots, x_m < 1 \\ 0 < y_1, \ldots, y_n < 1}} F_+(\mathbf{x}, \mathbf{y})$$

on the open cube $0 < x_i, y_j < 1$ and for the number $|A_+(R, C)|$ of non-negative
integer $m \times n$ matrices with row sums $R$ and column sums $C$, we have

$$\alpha_+(R, C) \ \ge \ |A_+(R, C)| \ \ge \ N^{-\gamma(m+n)} \alpha_+(R, C),$$

where $\gamma > 0$ is an absolute constant.

For sufficiently dense matrices, where

$$\min_{i = 1, \ldots, m} r_i = \Omega(n) \quad \text{and} \quad \min_{j = 1, \ldots, n} c_j = \Omega(m),$$

we have $|A_+(R, C)| = (N/mn)^{\Omega(mn)}$ and hence we obtain an asymptotically exact
formula

$$\ln |A_+(R, C)| \approx \ln \alpha_+(R, C) \quad \text{as } m, n \to +\infty.$$

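(3.2) A convex version of the optimization problem. Let us substitute

$$x_i = e^{-s_i} \ \text{for } i = 1, \ldots, m \quad \text{and} \quad y_j = e^{-t_j} \ \text{for } j = 1, \ldots, n$$

in $F_+(\mathbf{x}, \mathbf{y})$. Denoting

(3.2.1) $$G_+(\mathbf{s}, \mathbf{t}) = \sum_{i=1}^{m} r_i s_i + \sum_{j=1}^{n} c_j t_j - \sum_{i,j} \ln \bigl( 1 - e^{-s_i - t_j} \bigr)$$

for $\mathbf{s} = (s_1, \ldots, s_m)$ and $\mathbf{t} = (t_1, \ldots, t_n)$,

we obtain

$$\ln \alpha_+(R, C) = \min_{\substack{s_1, \ldots, s_m > 0 \\ t_1, \ldots, t_n > 0}} G_+(\mathbf{s}, \mathbf{t}).$$

We observe that $G_+(\mathbf{s}, \mathbf{t})$ is a convex function on $\mathbb{R}^{m+n}$. In particular, one can
compute the minimum of $G_+$ efficiently by using interior point methods [NN94].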

(3.3) Sketch of proof of Theorem 3.1. The upper bound for $|A_+(R, C)|$ follows
immediately from the expansion

$$\prod_{i,j} \frac{1}{1 - x_i y_j} = \sum_{R, C} |A_+(R, C)|\, \mathbf{x}^R \mathbf{y}^C \quad \text{for } 0 < x_1, \ldots, x_m, y_1, \ldots, y_n < 1,$$

where

$$\mathbf{x}^R = x_1^{r_1} \cdots x_m^{r_m} \quad \text{and} \quad \mathbf{y}^C = y_1^{c_1} \cdots y_n^{c_n}$$

for $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ and the sum is taken over all pairs
of non-negative integer vectors $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ such that
$r_1 + \ldots + r_m = c_1 + \ldots + c_n$.
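
The upper bound can be illustrated on tiny margins: every term of the expansion is non-negative, so the value of $\prod x_i^{-r_i} \prod y_j^{-c_j} \prod (1 - x_i y_j)^{-1}$ at any point of the open cube is at least the number of matrices. The Python sketch below is an illustration only; the names `count_nonneg_brute_force` and `F_plus` are hypothetical and the enumeration is exponential, so it is suitable only for very small examples.

```python
def count_nonneg_brute_force(R, C):
    """Count non-negative integer matrices with margins (R, C) (tiny cases only)."""
    if sum(R) != sum(C):
        return 0

    def compositions(total, caps):
        # All ordered ways to write `total` as a sum with the j-th part <= caps[j].
        if len(caps) == 1:
            if total <= caps[0]:
                yield (total,)
            return
        for first in range(min(total, caps[0]) + 1):
            for rest in compositions(total - first, caps[1:]):
                yield (first,) + rest

    def fill(row_index, remaining):
        # `remaining` holds the column sums still to be distributed.
        if row_index == len(R):
            return 1 if all(c == 0 for c in remaining) else 0
        total = 0
        for row in compositions(R[row_index], remaining):
            total += fill(row_index + 1,
                          tuple(c - d for c, d in zip(remaining, row)))
        return total

    return fill(0, tuple(C))


def F_plus(R, C, x, y):
    """Evaluate F_+(x, y) for 0 < x_i, y_j < 1."""
    value = 1.0
    for xi, r in zip(x, R):
        value *= xi ** (-r)
    for yj, c in zip(y, C):
        value *= yj ** (-c)
    for xi in x:
        for yj in y:
            value /= (1.0 - xi * yj)
    return value
```

For example, for $R = C = (2, 2)$ there are 3 matrices, while the value at $x_i = y_j = 0.7$ is roughly 256, which shows how far from the minimum an arbitrary point of the cube can be.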

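To prove the lower bound, we express $|A_+(R, C)|$ as the integral of the permanent
of an $N \times N$ matrix with variable entries. For an $m \times n$ matrix $Z = (z_{ij})$ we define
the $N \times N$ matrix $B(Z)$ as follows:

the rows of $B(Z)$ are split into $m$ distinct blocks of sizes $r_1, \ldots, r_m$ respectively;

the columns of $B(Z)$ are split into $n$ distinct blocks of sizes $c_1, \ldots, c_n$ respectively;

for $i = 1, \ldots, m$ and $j = 1, \ldots, n$, the entry of $B(Z)$ that lies in a row from the
$i$-th block of rows and in a column from the $j$-th block of columns is $z_{ij}$.

Then there is a combinatorial identity

$$\operatorname{per} B(Z) = \Bigl( \prod_{i=1}^{m} r_i! \Bigr) \Bigl( \prod_{j=1}^{n} c_j! \Bigr) \sum_{\substack{D \in A_+(R, C) \\ D = (d_{ij})}} \ \prod_{i,j} \frac{z_{ij}^{d_{ij}}}{d_{ij}!},$$

cf. [Be74], which implies that

$$|A_+(R, C)| = \Bigl( \prod_{i=1}^{m} \frac{1}{r_i!} \Bigr) \Bigl( \prod_{j=1}^{n} \frac{1}{c_j!} \Bigr) \int_{\mathbb{R}^{mn}_+} \operatorname{per} B(Z)\, \exp\Bigl\{ -\sum_{i,j} z_{ij} \Bigr\}\, dZ.$$

Here the integral is taken over the set $\mathbb{R}^{mn}_+$ of $m \times n$ matrices $Z$ with positive
entries. Let $\Delta_{mn-1} \subset \mathbb{R}^{mn}_+$ be the standard $(mn - 1)$-dimensional simplex defined
by the equation

$$\sum_{i,j} z_{ij} = 1.$$

Since $\operatorname{per} B(Z)$ is a homogeneous polynomial in $Z$ of degree $N$, we have

(3.3.1) $$|A_+(R, C)| = \frac{(N + mn - 1)!}{\sqrt{mn}} \Bigl( \prod_{i=1}^{m} \frac{1}{r_i!} \Bigr) \Bigl( \prod_{j=1}^{n} \frac{1}{c_j!} \Bigr) \int_{\Delta_{mn-1}} \operatorname{per} B(Z)\, dZ,$$

where $dZ$ is the Lebesgue measure on $\Delta_{mn-1}$ induced from $\mathbb{R}^{mn}$.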

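Let $\mathbf{s} = (s_1, \ldots, s_m)$ and $\mathbf{t} = (t_1, \ldots, t_n)$ be the minimum point of the function
$G_+(\mathbf{s}, \mathbf{t})$ defined by (3.2.1). Let $x_i = \exp\{-s_i\}$ for $i = 1, \ldots, m$ and $y_j = \exp\{-t_j\}$
for $j = 1, \ldots, n$. Setting the gradient of $G_+(\mathbf{s}, \mathbf{t})$ to 0, we obtain

(3.3.2) $$\sum_{j=1}^{n} \frac{x_i y_j}{1 - x_i y_j} = r_i \ \text{for } i = 1, \ldots, m \quad \text{and} \quad \sum_{i=1}^{m} \frac{x_i y_j}{1 - x_i y_j} = c_j \ \text{for } j = 1, \ldots, n.$$

Let us consider the affine subspace $L \subset \mathbb{R}^{mn}$ of $m \times n$ matrices $Z = (z_{ij})$ defined
by the system of equations

(3.3.3) $$\sum_{j=1}^{n} x_i y_j z_{ij} = \frac{r_i}{N + mn} \ \text{for } i = 1, \ldots, m \quad \text{and} \quad \sum_{i=1}^{m} x_i y_j z_{ij} = \frac{c_j}{N + mn} \ \text{for } j = 1, \ldots, n.$$

We note that $\dim L = (m - 1)(n - 1)$.

Suppose that $Z \in \Delta_{mn-1} \cap L$ and consider the corresponding matrix $B(Z)$. If we
multiply every row in the $i$-th block of rows by $x_i \sqrt{N + mn} / r_i$ and every column
in the $j$-th block of columns by $y_j \sqrt{N + mn} / c_j$, by (3.3.3) we obtain a doubly
stochastic matrix $B'(Z)$ for which we have $\operatorname{per} B'(Z) \ge N!/N^N$ by the van der
Waerden inequality. Summarizing,

(3.3.4) $$\operatorname{per} B(Z) \ \ge \ \frac{N!}{N^N (N + mn)^N} \Bigl( \prod_{i=1}^{m} \frac{r_i^{r_i}}{x_i^{r_i}} \Bigr) \Bigl( \prod_{j=1}^{n} \frac{c_j^{c_j}}{y_j^{c_j}} \Bigr)$$

for all $Z \in \Delta_{mn-1} \cap L$.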

It remains to show that the intersection $\Delta_{mn-1} \cap L$ is sufficiently large, so that
the contribution of a neighborhood of the intersection to the integral (3.3.1) is
sufficiently large. It follows by (3.3.2)-(3.3.3) that $\Delta_{mn-1} \cap L$ contains the matrix
$Z = (z_{ij})$ where

$$z_{ij} = \frac{1}{(N + mn)(1 - x_i y_j)} \quad \text{for all } i, j.$$

In [Ba09], we prove a geometric lemma which states that if $\Delta_{d-1} \subset \mathbb{R}^d$ is the
standard $(d - 1)$-dimensional simplex that is the intersection of the affine hyperplane $H$
defined by the equation $x_1 + \ldots + x_d = 1$ and the positive orthant $x_1 > 0, \ldots, x_d > 0$
and if $L \subset H$ is an affine subspace of codimension $k$ in $H$ such that $L$ contains a
point $a \in \Delta_{d-1}$, $a = (a_1, \ldots, a_d)$, then for the volume of the intersection $\Delta_{d-1} \cap L$
we have the lower bound

voLj_ fc _i (A d _! n L) > -^^"50!...^, 

where

$$\omega_k = \frac{\pi^{k/2}}{\Gamma(k/2 + 1)}$$

is the volume of the $k$-dimensional unit ball and $\gamma > 0$ is an absolute constant.
Applying this estimate in our situation, we conclude that

$$\operatorname{vol}_{mn-k} \bigl( \Delta_{mn-1} \cap L \bigr) \ \ge \ \frac{1}{(mn)^{O(m+n)} (N + mn)^{mn}} \prod_{i,j} \frac{1}{1 - x_i y_j},$$

where $k = m + n - 1$ or $k = m + n$ depending on whether or not $L$ lies in the
affine hyperplane $\sum_{i,j} z_{ij} = 1$. This allows us to obtain a similar bound for the
volume of a small neighborhood of the intersection $\Delta_{mn-1} \cap L$. Because $\operatorname{per} B(Z)$
is a homogeneous polynomial in $Z$ of degree $N$, inequality (3.3.4) holds in the
$\epsilon$-neighborhood of the intersection $\Delta_{mn-1} \cap L$ for $\epsilon = N^{-O(m+n)}$ up to an $N^{O(m+n)}$
factor. Using it together with (3.3.1), we complete the proof of Theorem 3.1. □
One can prove a version of Theorem 3.1 for non-negative integer matrices with
prescribed row and column sums and with prescribed zeros in some positions.

4. Discrete Brunn-Minkowski inequalities

Theorems 2.1 and 3.1 allow us to establish approximate log-concavity of the
numbers $|A_0(R, C)|$ and $|A_+(R, C)|$.

For a non-negative integer vector $B = (b_1, \ldots, b_p)$, we denote

$$|B| = \sum_{i=1}^{p} b_i.$$

(4.1) Theorem. Let $R_1, \ldots, R_p$ be positive integer $m$-vectors and let $C_1, \ldots, C_p$
be positive integer $n$-vectors such that $|R_1| = |C_1|, \ldots, |R_p| = |C_p|$.

Let $\beta_1, \ldots, \beta_p \ge 0$ be real numbers such that $\beta_1 + \ldots + \beta_p = 1$ and such that
$R = \beta_1 R_1 + \ldots + \beta_p R_p$ is a positive integer $m$-vector and $C = \beta_1 C_1 + \ldots + \beta_p C_p$
is a positive integer $n$-vector. Let $N = |R| = |C|$.

Then for some absolute constant $\gamma > 0$ we have

(1) $$(mn)^{\gamma(m+n)}\, |A_0(R, C)| \ \ge \ \prod_{k=1}^{p} |A_0(R_k, C_k)|^{\beta_k}$$

and

(2) $$N^{\gamma(m+n)}\, |A_+(R, C)| \ \ge \ \prod_{k=1}^{p} |A_+(R_k, C_k)|^{\beta_k}.$$

Proof. Let us denote the function $F_0$ of Theorem 2.1 for the pair $(R_k, C_k)$ by $F_k$ and
for the pair $(R, C)$ just by $F$. Then

(4.1.1) $$F(\mathbf{x}, \mathbf{y}) = \prod_{k=1}^{p} F_k^{\beta_k}(\mathbf{x}, \mathbf{y})$$

and hence

$$\alpha_0(R, C) \ \ge \ \prod_{k=1}^{p} \bigl( \alpha_0(R_k, C_k) \bigr)^{\beta_k}.$$

Part (1) now follows by Theorem 2.1.

Similarly, we obtain (4.1.1) if we denote the function $F_+$ of Theorem 3.1 for the pair
$(R_k, C_k)$ by $F_k$ and for the pair $(R, C)$ just by $F$. Hence

$$\alpha_+(R, C) \ \ge \ \prod_{k=1}^{p} \bigl( \alpha_+(R_k, C_k) \bigr)^{\beta_k}.$$

Part (2) now follows by Theorem 3.1. □



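Theorem 2.1 implies a more precise estimate

$$\frac{(mn)^{mn}}{(mn)!} \Bigl( \prod_{i=1}^{m} \frac{(n - r_i)!}{(n - r_i)^{n - r_i}} \Bigr) \Bigl( \prod_{j=1}^{n} \frac{c_j!}{c_j^{c_j}} \Bigr) |A_0(R, C)| \ \ge \ \prod_{k=1}^{p} |A_0(R_k, C_k)|^{\beta_k},$$

where $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$.
In [Ba07] a more precise estimate

$$\frac{N^N}{N!} \min \Bigl\{ \prod_{i=1}^{m} \frac{r_i!}{r_i^{r_i}}, \ \prod_{j=1}^{n} \frac{c_j!}{c_j^{c_j}} \Bigr\}\, |A_+(R, C)| \ \ge \ \prod_{k=1}^{p} |A_+(R_k, C_k)|^{\beta_k}$$

is proven under the additional assumption that $|R_k| = |C_k| = N$ for $k = 1, \ldots, p$.
Theorem 4.1 raises a natural question whether stronger inequalities hold.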

(4.2) Brunn-Minkowski inequalities.

(4.2.1) Question. Is it true that under the conditions of Theorem 4.1 we have

$$|A_0(R, C)| \ \ge \ \prod_{k=1}^{p} |A_0(R_k, C_k)|^{\beta_k}\ ?$$

(4.2.2) Question. Is it true that under the conditions of Theorem 4.1 we have

$$|A_+(R, C)| \ \ge \ \prod_{k=1}^{p} |A_+(R_k, C_k)|^{\beta_k}\ ?$$

Should they hold, the inequalities of (4.2.1) and (4.2.2) would be natural examples
of discrete Brunn-Minkowski inequalities, see [Ga02] for a survey.

Some known simpler inequalities are consistent with the inequalities of (4.2.1)-
(4.2.2). Let $X = (x_1, \ldots, x_p)$ and $Y = (y_1, \ldots, y_p)$ be non-negative integer vectors
such that

$$x_1 \ge x_2 \ge \ldots \ge x_p \quad \text{and} \quad y_1 \ge y_2 \ge \ldots \ge y_p.$$

We say that $X$ dominates $Y$ if

$$\sum_{i=1}^{k} x_i \ \ge \ \sum_{i=1}^{k} y_i \ \text{for } k = 1, \ldots, p - 1 \quad \text{and} \quad \sum_{i=1}^{p} x_i = \sum_{i=1}^{p} y_i.$$

Equivalently, $X$ dominates $Y$ if $Y$ is a convex combination of vectors obtained from
$X$ by permutations of coordinates.
One can show that

(4.2.3) $$|A_0(R, C)| \ \ge \ |A_0(R', C')| \quad \text{and} \quad |A_+(R, C)| \ \ge \ |A_+(R', C')|$$

provided $R'$ dominates $R$ and $C'$ dominates $C$, see Chapter 16 of [LW01] and [Ba07].
Inequalities (4.2.3) are consistent with the inequalities of (4.2.1) and (4.2.2).

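
The dominance relation is a finite check on partial sums. As an illustration, here is a minimal Python sketch; the function name `dominates` is hypothetical, and the vectors are sorted inside the function so that the caller need not supply them in non-increasing order.

```python
def dominates(X, Y):
    """Return True if X dominates Y: equal totals and larger partial sums."""
    if len(X) != len(Y) or sum(X) != sum(Y):
        return False
    xs = sorted(X, reverse=True)
    ys = sorted(Y, reverse=True)
    px = py = 0
    # Compare the partial sums for k = 1, ..., p - 1; the case k = p is the
    # equality of totals checked above.
    for a, b in zip(xs[:-1], ys[:-1]):
        px += a
        py += b
        if px < py:
            return False
    return True
```

For example, $(3, 1)$ dominates $(2, 2)$ but not conversely, in line with the intuition that more uniform margins are dominated by less uniform ones.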
5. Dependence between row and column sums

The following attractive "independence heuristic" for estimating $|A_0(R, C)|$ and
$|A_+(R, C)|$ was discussed by Good [Go77] and by Good and Crook [GC76].

(5.1) The independence heuristic. Let us consider the set of all $m \times n$ matrices
$D = (d_{ij})$ with 0-1 entries and the total sum $N$ of entries as a finite probability
space with the uniform measure. Let us consider the event $\mathcal{R}_0$ consisting of the
matrices with the row sums $R = (r_1, \ldots, r_m)$ and the event $\mathcal{C}_0$ consisting of the
matrices with the column sums $C = (c_1, \ldots, c_n)$. Then

$$\Pr(\mathcal{R}_0) = \binom{mn}{N}^{-1} \prod_{i=1}^{m} \binom{n}{r_i} \quad \text{and} \quad \Pr(\mathcal{C}_0) = \binom{mn}{N}^{-1} \prod_{j=1}^{n} \binom{m}{c_j}.$$

In addition,

$$A_0(R, C) = \mathcal{R}_0 \cap \mathcal{C}_0.$$

If we assume that the events $\mathcal{R}_0$ and $\mathcal{C}_0$ are independent, we obtain the following
independence estimate

(5.1.1) $$I_0(R, C) = \binom{mn}{N}^{-1} \prod_{i=1}^{m} \binom{n}{r_i} \prod_{j=1}^{n} \binom{m}{c_j}$$

for the number $|A_0(R, C)|$ of 0-1 matrices with row sums $R$ and column sums $C$.

Similarly, let us consider the set of all $m \times n$ matrices $D = (d_{ij})$ with non-negative
integer entries and the total sum $N$ of entries as a finite probability space with the
uniform measure. Let us consider the event $\mathcal{R}_+$ consisting of the matrices with the
row sums $R = (r_1, \ldots, r_m)$ and the event $\mathcal{C}_+$ consisting of the matrices with the
column sums $C = (c_1, \ldots, c_n)$. Then

$$\Pr(\mathcal{R}_+) = \binom{N + mn - 1}{mn - 1}^{-1} \prod_{i=1}^{m} \binom{r_i + n - 1}{n - 1} \quad \text{and}$$

$$\Pr(\mathcal{C}_+) = \binom{N + mn - 1}{mn - 1}^{-1} \prod_{j=1}^{n} \binom{c_j + m - 1}{m - 1}.$$

We have

$$A_+(R, C) = \mathcal{R}_+ \cap \mathcal{C}_+.$$

If we assume that the events $\mathcal{R}_+$ and $\mathcal{C}_+$ are independent, we obtain the independence
estimate

(5.1.2) $$I_+(R, C) = \binom{N + mn - 1}{mn - 1}^{-1} \prod_{i=1}^{m} \binom{r_i + n - 1}{n - 1} \prod_{j=1}^{n} \binom{c_j + m - 1}{m - 1}.$$

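
Both independence estimates are products of binomial coefficients and can be evaluated exactly. The following Python sketch is an illustration; the function names are hypothetical, and exact rational arithmetic via `fractions.Fraction` is a convenience choice rather than anything prescribed by the text.

```python
from fractions import Fraction
from math import comb, prod


def independence_estimate_01(R, C):
    """Independence estimate for 0-1 matrices, as an exact fraction."""
    m, n = len(R), len(C)
    N = sum(R)
    numerator = prod(comb(n, r) for r in R) * prod(comb(m, c) for c in C)
    return Fraction(numerator, comb(m * n, N))


def independence_estimate_nonneg(R, C):
    """Independence estimate for non-negative integer matrices, as an exact fraction."""
    m, n = len(R), len(C)
    N = sum(R)
    rows = prod(comb(r + n - 1, n - 1) for r in R)
    cols = prod(comb(c + m - 1, m - 1) for c in C)
    return Fraction(rows * cols, comb(N + m * n - 1, m * n - 1))
```

On two tiny examples that can be checked by hand: for $R = (2, 1)$, $C = (1, 1, 1)$ the 0-1 estimate is $18/5$ against a true count of 3, and for $R = C = (2, 2)$ the non-negative integer estimate is $81/35$ against a true count of 3. These small cases only illustrate the formulas and say nothing about the asymptotic behavior.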
Interestingly, the independence estimates $I_0(R, C)$ and $I_+(R, C)$ provide reasonable
approximations to $|A_0(R, C)|$ and $|A_+(R, C)|$ respectively in the following two cases:

in the case of equal margins, when

$$r_1 = \ldots = r_m = r \quad \text{and} \quad c_1 = \ldots = c_n = c,$$

see [C+08] and [C+07];

in the sparse case, when

$$\max_{i = 1, \ldots, m} r_i \ll n \quad \text{and} \quad \max_{j = 1, \ldots, n} c_j \ll m,$$

see [G+06] and [GM08].

We will see in Section 5.4 that the independence estimates provide the correct
logarithmic asymptotics in the case when all row sums are equal or all column sums
are equal. However, if both row and column sums are sufficiently far away from
being uniform and sparse, the independence estimates, generally speaking,
provide poor approximations. Moreover, in the case of 0-1 matrices the independence
estimate $I_0(R, C)$ typically grossly overestimates $|A_0(R, C)|$ while in the case of
non-negative integer matrices the independence estimate $I_+(R, C)$ typically grossly
underestimates $|A_+(R, C)|$. In other words, for typical margins $R$ and $C$ the events
$\mathcal{R}_0$ and $\mathcal{C}_0$ repel each other (the events are negatively correlated) while the events $\mathcal{R}_+$
and $\mathcal{C}_+$ attract each other (the events are positively correlated). To see why this is
the case, we write the estimates $\alpha_0(R, C)$ of Theorem 2.1 and $\alpha_+(R, C)$ of Theorem
3.1 in terms of entropy.

The following result is proven in [Ba10a].

(5.2) Lemma. Let $P_0(R, C)$ be the polytope of all $m \times n$ matrices $X = (x_{ij})$ with
row sums $R$, column sums $C$ and such that $0 \le x_{ij} \le 1$ for all $i$ and $j$. Suppose that
the polytope $P_0(R, C)$ has a non-empty interior, that is, contains a matrix $Y = (y_{ij})$
such that $0 < y_{ij} < 1$ for all $i$ and $j$. Let us define a function $h : P_0(R, C) \to \mathbb{R}$
by

$$h(X) = \sum_{i,j} \Bigl( x_{ij} \ln \frac{1}{x_{ij}} + (1 - x_{ij}) \ln \frac{1}{1 - x_{ij}} \Bigr) \quad \text{for } X \in P_0(R, C).$$

Then $h$ is a strictly concave function on $P_0(R, C)$ and hence attains its maximum
on $P_0(R, C)$ at a unique matrix $Z_0 = (z_{ij})$, which we call the maximum entropy
matrix. Moreover,

(1) We have $0 < z_{ij} < 1$ for all $i$ and $j$;

(2) The infimum $\alpha_0(R, C)$ of Theorem 2.1 is attained at some particular point
$(\mathbf{x}, \mathbf{y})$;

(3) We have $\alpha_0(R, C) = e^{h(Z_0)}$.

Sketch of Proof. It is straightforward to check that $h$ is strictly concave and that

$$\frac{\partial}{\partial x_{ij}} h(X) = \ln \frac{1 - x_{ij}}{x_{ij}}.$$

In particular, the (right) derivative at $x_{ij} = 0$ is $+\infty$, the (left) derivative at $x_{ij} = 1$
is $-\infty$ and the derivative for $0 < x_{ij} < 1$ is finite. Hence the maximum entropy
matrix $Z_0$ must have all entries strictly between 0 and 1, since otherwise we can
increase the value of $h$ by perturbing $Z_0$ in the direction of a matrix $Y$ from the
interior of $P_0(R, C)$. This proves Part (1).

The Lagrange optimality conditions imply that

$$\ln \frac{1 - z_{ij}}{z_{ij}} = -\lambda_i - \mu_j \quad \text{for all } i, j$$

and some numbers $\lambda_1, \ldots, \lambda_m$ and $\mu_1, \ldots, \mu_n$. Hence

(5.2.1) $$z_{ij} = \frac{e^{\lambda_i + \mu_j}}{1 + e^{\lambda_i + \mu_j}} \quad \text{for all } i, j.$$

In particular,

(5.2.2) $$\sum_{i=1}^{m} \frac{e^{\lambda_i + \mu_j}}{1 + e^{\lambda_i + \mu_j}} = c_j \ \text{for } j = 1, \ldots, n \quad \text{and} \quad \sum_{j=1}^{n} \frac{e^{\lambda_i + \mu_j}}{1 + e^{\lambda_i + \mu_j}} = r_i \ \text{for } i = 1, \ldots, m.$$

Equations (5.2.2) imply that the point $\mathbf{s} = (\lambda_1, \ldots, \lambda_m)$ and $\mathbf{t} = (\mu_1, \ldots, \mu_n)$ is a
critical point of the function $G_0(\mathbf{s}, \mathbf{t})$ defined by (2.2.1) and hence the infimum $\alpha_0(R, C)$
of $F_0(\mathbf{x}, \mathbf{y})$ is attained at $x_i = e^{\lambda_i}$ for $i = 1, \ldots, m$ and $y_j = e^{\mu_j}$ for $j = 1, \ldots, n$.
Hence Part (2) follows. Using (5.2.1) it is then straightforward to check that
$F_0(\mathbf{x}, \mathbf{y}) = e^{h(Z_0)}$ for the minimum point $(\mathbf{x}, \mathbf{y})$. □

We note that

$$h(x) = x \ln \frac{1}{x} + (1 - x) \ln \frac{1}{1 - x} \quad \text{for } 0 \le x \le 1$$

is the entropy of the Bernoulli random variable with expectation $x$, see Section 6.
The following result is proven in [Ba09].

(5.3) Lemma. Let $P_+(R, C)$ be the polytope of all non-negative $m \times n$ matrices
$X = (x_{ij})$ with row sums $R$ and column sums $C$. Let us define a function $g :
P_+(R, C) \to \mathbb{R}$ by

$$g(X) = \sum_{i,j} \Bigl( (x_{ij} + 1) \ln (1 + x_{ij}) - x_{ij} \ln x_{ij} \Bigr) \quad \text{for } X \in P_+(R, C).$$

Then $g$ is a strictly concave function on $P_+(R, C)$ and hence attains its maximum
on $P_+(R, C)$ at a unique matrix $Z_+ = (z_{ij})$, which we call the maximum entropy
matrix. Moreover,

(1) We have $z_{ij} > 0$ for all $i, j$ and

(2) For the minimum $\alpha_+(R, C)$ of Theorem 3.1, we have $\alpha_+(R, C) = e^{g(Z_+)}$.

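Sketch of Proof. It is straightforward to check that $g$ is strictly concave and that

$$\frac{\partial}{\partial x_{ij}} g(X) = \ln \frac{x_{ij} + 1}{x_{ij}} \quad \text{for all } i, j.$$

In particular, the (right) derivative is $+\infty$ for $x_{ij} = 0$ and finite for every $x_{ij} > 0$.
Since $P_+(R, C)$ contains an interior point (for example, the matrix $Y = (y_{ij})$ with
$y_{ij} = r_i c_j / N$), arguing as in the proof of Lemma 5.2, we obtain Part (1).
The Lagrange optimality conditions imply that

$$\ln \frac{z_{ij} + 1}{z_{ij}} = \lambda_i + \mu_j \quad \text{for all } i, j$$

and some numbers $\lambda_1, \ldots, \lambda_m$ and $\mu_1, \ldots, \mu_n$. Hence

(5.3.1) $$z_{ij} = \frac{e^{-\lambda_i - \mu_j}}{1 - e^{-\lambda_i - \mu_j}} \quad \text{for all } i, j.$$

In particular,

(5.3.2) $$\sum_{i=1}^{m} \frac{e^{-\lambda_i - \mu_j}}{1 - e^{-\lambda_i - \mu_j}} = c_j \ \text{for } j = 1, \ldots, n \quad \text{and} \quad \sum_{j=1}^{n} \frac{e^{-\lambda_i - \mu_j}}{1 - e^{-\lambda_i - \mu_j}} = r_i \ \text{for } i = 1, \ldots, m.$$

Equations (5.3.2) imply that the point $\mathbf{s} = (\lambda_1, \ldots, \lambda_m)$ and $\mathbf{t} = (\mu_1, \ldots, \mu_n)$ is
a critical point of the function $G_+(\mathbf{s}, \mathbf{t})$ defined by (3.2.1) and hence the minimum
$\alpha_+(R, C)$ of $F_+(\mathbf{x}, \mathbf{y})$ is attained at $x_i = e^{-\lambda_i}$ for $i = 1, \ldots, m$ and $y_j = e^{-\mu_j}$ for
$j = 1, \ldots, n$. Using (5.3.1), it is then straightforward to check that $F_+(\mathbf{x}, \mathbf{y}) = e^{g(Z_+)}$
for the minimum point $(\mathbf{x}, \mathbf{y})$. □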

We note that

$$g(x) = (x + 1) \ln (x + 1) - x \ln x \quad \text{for } x \ge 0$$

is the entropy of the geometric random variable with expectation $x$, see Section 6.

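
The entropy interpretation of $(x + 1) \ln(x + 1) - x \ln x$ can be verified numerically. The Python sketch below is an illustration under the assumption that the geometric random variable takes values $k = 0, 1, 2, \ldots$ with probabilities $p q^k$, where $p + q = 1$, so that its expectation is $q/p$; the function names and the truncation length are hypothetical choices.

```python
import math


def g(x):
    """Closed form (x + 1) ln(x + 1) - x ln x, with the convention 0 ln 0 = 0."""
    return (x + 1) * math.log(x + 1) - (x * math.log(x) if x > 0 else 0.0)


def geometric_entropy(x, terms=1000):
    """Entropy of Pr{k} = p q^k, k = 0, 1, ..., with mean q / p = x > 0 (truncated sum)."""
    p = 1.0 / (1.0 + x)
    q = x / (1.0 + x)
    total = 0.0
    for k in range(terms):
        prob = p * q ** k
        # ln(p q^k) = ln p + k ln q, written out to avoid taking the log of tiny numbers
        total -= prob * (math.log(p) + k * math.log(q))
    return total
```

For moderate values of $x$ the truncated sum agrees with the closed form to high accuracy, since the neglected tail decays geometrically.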
(5.4) Estimates of the cardinality via entropy. Let

$$H(p_1, \ldots, p_k) = \sum_{i=1}^{k} p_i \ln \frac{1}{p_i}$$

be the entropy function defined on $k$-tuples (probability distributions) $p_1, \ldots, p_k$
such that $p_1 + \ldots + p_k = 1$ and $p_i \ge 0$ for $i = 1, \ldots, k$. Assuming that the polytope
$P_0(R, C)$ of Lemma 5.2 has a non-empty interior, we can write

$$\ln \alpha_0(R, C) = N\, H\Bigl( \frac{z_{ij}}{N};\ i, j \Bigr) + (mn - N)\, H\Bigl( \frac{1 - z_{ij}}{mn - N};\ i, j \Bigr) - N \ln N - (mn - N) \ln (mn - N),$$

where $Z_0 = (z_{ij})$ is the maximum entropy matrix. On the other hand, for the
independence estimate (5.1.1), we have

$$\ln I_0(R, C) = N\, H\Bigl( \frac{r_i}{N};\ i \Bigr) + (mn - N)\, H\Bigl( \frac{n - r_i}{mn - N};\ i \Bigr) + N\, H\Bigl( \frac{c_j}{N};\ j \Bigr) + (mn - N)\, H\Bigl( \frac{m - c_j}{mn - N};\ j \Bigr)$$
$$\qquad - N \ln N - (mn - N) \ln (mn - N) + O\bigl( (m + n) \ln (mn) \bigr).$$

Using the inequality which relates the entropy of a distribution and the entropy of
its margins, see, for example, [Kh57], we obtain

(5.4.1) $$H\Bigl( \frac{z_{ij}}{N};\ i, j \Bigr) \ \le \ H\Bigl( \frac{r_i}{N};\ i \Bigr) + H\Bigl( \frac{c_j}{N};\ j \Bigr)$$

with the equality if and only if

$$z_{ij} = \frac{r_i c_j}{N} \quad \text{for all } i, j$$

and

(5.4.2) $$H\Bigl( \frac{1 - z_{ij}}{mn - N};\ i, j \Bigr) \ \le \ H\Bigl( \frac{n - r_i}{mn - N};\ i \Bigr) + H\Bigl( \frac{m - c_j}{mn - N};\ j \Bigr)$$

with the equality if and only if

$$1 - z_{ij} = \frac{(n - r_i)(m - c_j)}{mn - N} \quad \text{for all } i, j.$$

Thus we have equalities in (5.4.1) and (5.4.2) if and only if

$$(r_i m - N)(c_j n - N) = 0 \quad \text{for all } i, j,$$

that is, when all row sums are equal or all column sums are equal. In that case
$I_0(R, C)$ estimates $|A_0(R, C)|$ within an $(mn)^{O(m+n)}$ factor. In all other cases,
$I_0(R, C)$ overestimates $|A_0(R, C)|$ by as much as a $2^{\Omega(mn)}$ factor as long as the
differences between the right hand sides and left hand sides of (5.4.1) and (5.4.2)
multiplied by $N$ and $(mn - N)$ respectively overcome the $O((m + n) \ln (mn))$ error
term, see also Section 5.5 for a particular family of examples.

We handle non-negative integer matrices slightly differently. For the
independence estimate (5.1.2) we obtain

$$\ln I_+(R, C) = -(N + mn)\, H\Bigl( \frac{r_i + n}{N + mn};\ i \Bigr) - (N + mn)\, H\Bigl( \frac{c_j + m}{N + mn};\ j \Bigr)$$
$$\qquad - \sum_{i=1}^{m} r_i \ln r_i - \sum_{j=1}^{n} c_j \ln c_j + N \ln N + (N + mn) \ln (N + mn) + O\bigl( (m + n) \ln N \bigr).$$

On the other hand, by Lemma 5.3 we have

$$\ln \alpha_+(R, C) = g(Z_+) \ \ge \ g(Y),$$

where $Z_+$ is the maximum entropy matrix and $Y = (y_{ij})$ is the matrix defined by

$$y_{ij} = \frac{r_i c_j}{N} \quad \text{for all } i, j.$$

It is then easy to check that

$$g(Y) = -(N + mn)\, H\Bigl( \frac{r_i c_j + N}{N (N + mn)};\ i, j \Bigr) - \sum_{i=1}^{m} r_i \ln r_i - \sum_{j=1}^{n} c_j \ln c_j + N \ln N + (N + mn) \ln (N + mn).$$

By the inequality relating the entropy of a distribution and the entropy of its
margins [Kh57], we have

(5.4.3) $$H\Bigl( \frac{r_i c_j + N}{N (N + mn)};\ i, j \Bigr) \ \le \ H\Bigl( \frac{r_i + n}{N + mn};\ i \Bigr) + H\Bigl( \frac{c_j + m}{N + mn};\ j \Bigr)$$

with the equality if and only if

$$\frac{r_i c_j + N}{N (N + mn)} = \frac{(r_i + n)(c_j + m)}{(N + mn)^2} \quad \text{for all } i, j,$$

that is, when we have

$$(r_i m - N)(c_j n - N) = 0 \quad \text{for all } i, j,$$

so that all row sums are equal or all column sums are equal. In that case, by
symmetry we have $Y = Z_+$ and hence $I_+(R, C)$ estimates $|A_+(R, C)|$ within an
$N^{O(m+n)}$ factor. In all other cases, $I_+(R, C)$ underestimates $|A_+(R, C)|$ by as
much as a $2^{\Omega(mn)}$ factor as long as the difference between the right hand side and
left hand side of (5.4.3) multiplied by $N + mn$ overcomes the $O((m + n) \ln N)$ error
term, see also Section 5.5 for a particular family of examples.

(5.5) Cloning margins. 

Let us choose a positive integer m- vector R = (ri, . . . , r m ) and a positive integer 
n- vector C = (c\, . . . , c n ) such that 

7*1 + . . . + r m = ci + . . . + c n = N. 

For a positive integer k, let us define a /cm-vector Rj~ and a /cn-vector Ck by 

Rk = ( k ri, . . . , kri , . . . , kr m , . . . , fc r m ) and 

fe times A: times 

Ck = I feci, . . . , kc±, • • • , ^c n , . . . , kc r 



-■n 

A: times A; times 



We say that margins (Rf-,Ck) are obtained by cloning from margins (R,C). It is 
not hard to show that if Z and Z + are the maximum entropy matrices associated 
with margins (R, C) via Lemma 5.2 and Lemma 5.3 respectively, then the maximum 
entropy matrices associated with margins (Rk, Ck) are the Kronecker products Zq® 
Idk and Z + <g> Idk respectively, where Idk is the k x k identity matrix. One has 



|i/fc 2 

lim \A + (R k ,Ck)\ 1/k ' Z =a+(R,C). 



lim \Ao(Rk,C k )r k =a (R,C) and 

fc — y+ao 



> + QO 

Moreover, if not all coordinates $r_i$ of $R$ are equal and not all coordinates $c_j$ of $C$ are equal then the independence estimate $I_0(R_k, C_k)$, see (5.1.1), overestimates the number of $km \times kn$ matrices with row sums $R_k$ and column sums $C_k$ and 0-1 entries within a $2^{\Omega(k^2)}$ factor while the independence estimate $I_+(R_k, C_k)$, see (5.1.2), underestimates the number of $km \times kn$ non-negative integer matrices within a $2^{\Omega(k^2)}$ factor, see [Ba10a] and [Ba09] for details.
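
The cloning construction is easy to experiment with. The following minimal Python sketch builds $(R_k, C_k)$ from $(R, C)$ and repeats every entry of a matrix over a $k \times k$ block, which is the Kronecker product with the $k \times k$ matrix of 1s. It then checks that the cloned matrix has margins $(R_k, C_k)$. The function names `clone_margins`, `clone_matrix` and `margins` are illustrative and do not come from the text, and the sample matrix `Z` is an arbitrary matrix with margins $R = (2, 3)$ and $C = (1, 4)$ chosen only for the check, not a maximum entropy matrix.

```python
def clone_margins(R, C, k):
    """Return (R_k, C_k): each coordinate is multiplied by k and repeated k times."""
    Rk = [k * r for r in R for _ in range(k)]
    Ck = [k * c for c in C for _ in range(k)]
    return Rk, Ck


def clone_matrix(Z, k):
    """Repeat every entry of Z over a k x k block (Kronecker product with all-ones)."""
    return [[z for z in row for _ in range(k)] for row in Z for _ in range(k)]


def margins(M):
    """Row sums and column sums of a matrix given as a list of rows."""
    return [sum(row) for row in M], [sum(col) for col in zip(*M)]


# Hypothetical example: a 2 x 2 matrix with margins R = (2, 3), C = (1, 4).
R, C, k = [2, 3], [1, 4], 3
Z = [[0.5, 1.5], [0.5, 2.5]]
Rk, Ck = clone_margins(R, C, k)
Zk = clone_matrix(Z, k)
rows, cols = margins(Zk)
print(Rk, Ck)
print(rows == Rk and cols == Ck)
```

Each row of a block contributes $k z_{ij}$ to the row sum, which is why the cloned row sums are $k r_i$ and the cloned column sums are $k c_j$.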

6. Random matrices with prescribed row and column sums

Estimates of Theorems 2.1 and 3.1, however crude, allow us to obtain a description of a random or typical matrix from sets $A_0(R, C)$ and $A_+(R, C)$, considered as finite probability spaces with the uniform measures.

Recall that $x$ is a Bernoulli random variable if

$$\Pr\{x = 0\} = p \quad \text{and} \quad \Pr\{x = 1\} = q$$

for some $p, q > 0$ such that $p + q = 1$. Clearly, $\mathbf{E}\, x = q$.

Recall that $P_0(R, C)$ is the polytope of $m \times n$ matrices with row sums $R$, column sums $C$ and entries between 0 and 1. Let function $h : P_0(R, C) \to \mathbb{R}$ and the maximum entropy matrix $Z_0 \in P_0(R, C)$ be defined as in Lemma 5.2.

The following result is proven in [Ba10a], see also [BH10a].

(6.1) Theorem. Suppose that polytope $P_0(R, C)$ has a non-empty interior and let $Z_0 \in P_0(R, C)$ be the maximum entropy matrix. Let $X = (x_{ij})$ be a random $m \times n$ matrix of independent Bernoulli random variables $x_{ij}$ such that $\mathbf{E}\, X = Z_0$. Then

(1) The probability mass function of $X$ is constant on the set $A_0(R, C)$ of 0-1 matrices with row sums $R$ and column sums $C$ and

$$\Pr\{X = D\} = e^{-h(Z_0)} \quad \text{for all } D \in A_0(R, C);$$

(2) We have

$$\Pr\{X \in A_0(R, C)\} \geq (mn)^{-\gamma(m+n)},$$

where $\gamma > 0$ is an absolute constant.

Theorem 6.1 implies that in many respects a random matrix $D \in A_0(R, C)$ behaves as a random matrix $X$ of independent Bernoulli random variables such that $\mathbf{E}\, X = Z_0$, where $Z_0$ is the maximum entropy matrix. More precisely, any event that is sufficiently rare for the random matrix $X$ (that is, an event the probability of which is essentially smaller than $(mn)^{-O(m+n)}$), will also be a rare event for a random matrix $D \in A_0(R, C)$. In particular, we can conclude that a typical matrix $D \in A_0(R, C)$ is sufficiently close to $Z_0$ as long as sums of entries over sufficiently large subsets $S$ of indices are concerned.
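
Part (1) of Theorem 6.1 can be checked by brute force on a tiny example. The Python sketch below is an illustration under stated assumptions, not the method of [Ba10a]. It assumes that $h$ is the sum of the binary entropies of the entries and that the maximum entropy matrix has the form $z_{ij} = x_i y_j / (1 + x_i y_j)$ for some positive $x_i$, $y_j$; both facts belong to Lemma 5.2, which is stated earlier. The alternating bisection used to find $x$ and $y$ is one simple way to fit the margins, and all function names are hypothetical.

```python
import itertools
import math


def solve_scale(target, other):
    """Find x > 0 with sum_j x*o_j / (1 + x*o_j) = target by bisection on log x."""
    lo, hi = -60.0, 60.0
    for _ in range(100):
        mid = (lo + hi) / 2
        x = math.exp(mid)
        if sum(x * o / (1 + x * o) for o in other) < target:
            lo = mid
        else:
            hi = mid
    return math.exp((lo + hi) / 2)


def max_entropy_01(R, C, sweeps=200):
    """Alternately match row and column sums of z_ij = x_i y_j / (1 + x_i y_j)."""
    x, y = [1.0] * len(R), [1.0] * len(C)
    for _ in range(sweeps):
        x = [solve_scale(r, y) for r in R]
        y = [solve_scale(c, x) for c in C]
    return [[xi * yj / (1 + xi * yj) for yj in y] for xi in x]


def tables_01(R, C):
    """Brute-force list of all 0-1 matrices with row sums R and column sums C."""
    m, n = len(R), len(C)
    found = []
    for bits in itertools.product((0, 1), repeat=m * n):
        D = [bits[i * n:(i + 1) * n] for i in range(m)]
        if [sum(r) for r in D] == list(R) and [sum(c) for c in zip(*D)] == list(C):
            found.append(D)
    return found


def prob(D, Z):
    """Pr{X = D} for independent Bernoulli entries with E X = Z."""
    p = 1.0
    for d_row, z_row in zip(D, Z):
        for d, z in zip(d_row, z_row):
            p *= z if d else 1 - z
    return p


def entropy_h(Z):
    """Sum of the binary entropies of the entries (assumed form of h)."""
    return sum(-z * math.log(z) - (1 - z) * math.log(1 - z) for row in Z for z in row)


R = C = (2, 1, 1)
Z = max_entropy_01(R, C)
tables = tables_01(R, C)
probs = [prob(D, Z) for D in tables]
print(len(tables), min(probs), max(probs), math.exp(-entropy_h(Z)))
```

For these margins the set $A_0(R, C)$ is small enough to enumerate, so the printed minimum and maximum of $\Pr\{X = D\}$ can be compared directly with $e^{-h(Z_0)}$, and $\Pr\{X \in A_0(R, C)\}$ is the number of tables times that common value.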

For an $m \times n$ matrix $B = (b_{ij})$ and a subset

$$S \subset \{(i, j) : i = 1, \ldots, m;\ j = 1, \ldots, n\}$$

let

$$\sigma_S(B) = \sum_{(i,j) \in S} b_{ij}$$

be the sum of the entries of $B$ indexed by set $S$. We obtain the following corollary, see [Ba10a] for details.

(6.2) Corollary. Let us fix real numbers $\kappa > 0$ and $0 < \delta < 1$. Then there exists a number $q = q(\kappa, \delta) > 0$ such that the following holds.

Let $(R, C)$ be margins such that $n \geq m > q$ and the polytope $P_0(R, C)$ has a non-empty interior and let $Z_0 \in P_0(R, C)$ be the maximum entropy matrix. Let $S \subset \{(i, j) : i = 1, \ldots, m;\ j = 1, \ldots, n\}$ be a set such that $\sigma_S(Z_0) \geq \delta mn$ and

let

$$\epsilon = \frac{\delta \ln n}{\sqrt{m}}.$$

If $\epsilon \leq 1$ then

$$\Pr\bigl\{D \in A_0(R, C) : (1 - \epsilon)\, \sigma_S(Z_0) \leq \sigma_S(D) \leq (1 + \epsilon)\, \sigma_S(Z_0)\bigr\} \geq 1 - n^{-\kappa n}.$$

Recall that $x$ is a geometric random variable if

$$\Pr\{x = k\} = p q^k \quad \text{for } k = 0, 1, 2, \ldots$$

for some $p, q > 0$ such that $p + q = 1$. We have $\mathbf{E}\, x = q/p$.

Recall that $P_+(R, C)$ is the polytope of $m \times n$ non-negative matrices with row sums $R$ and column sums $C$. Let function $g : P_+(R, C) \to \mathbb{R}$ and the maximum entropy matrix $Z_+ \in P_+(R, C)$ be defined as in Lemma 5.3.

The following result is proven in [Ba10b], see also [BH10a].

(6.3) Theorem. Let $Z_+ \in P_+(R, C)$ be the maximum entropy matrix. Let $X = (x_{ij})$ be a random $m \times n$ matrix of independent geometric random variables $x_{ij}$ such that $\mathbf{E}\, X = Z_+$. Then

(1) The probability mass function of $X$ is constant on the set $A_+(R, C)$ of non-negative integer matrices with row sums $R$ and column sums $C$ and

$$\Pr\{X = D\} = e^{-g(Z_+)} \quad \text{for all } D \in A_+(R, C);$$

(2) We have

$$\Pr\{X \in A_+(R, C)\} \geq N^{-\gamma(m+n)},$$

where $\gamma > 0$ is an absolute constant and $N = r_1 + \ldots + r_m = c_1 + \ldots + c_n$ for $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$.

Theorem 6.3 implies that in many respects a random matrix $D \in A_+(R, C)$ behaves as a matrix $X$ of independent geometric random variables such that $\mathbf{E}\, X = Z_+$, where $Z_+$ is the maximum entropy matrix. More precisely, any event that is sufficiently rare for the random matrix $X$ (that is, an event the probability of which is essentially smaller than $N^{-O(m+n)}$), will also be a rare event for a random matrix $D \in A_+(R, C)$. In particular, we can conclude that a typical matrix $D \in A_+(R, C)$ is sufficiently close to $Z_+$ as long as sums of entries over sufficiently large subsets $S$ of indices are concerned.
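
A small worked example may help to fix ideas about Theorem 6.3. It assumes that the function $g$ of Lemma 5.3 is the sum of the geometric entropies $(z + 1)\ln(z + 1) - z \ln z$ of the entries; Lemma 5.3 is stated earlier and is not reproduced here. Take $m = n = 2$ and $R = C = (2, 2)$, so that $N = 4$. By symmetry all entries of $Z_+$ are equal to 1, and the entries of $X$ are geometric with $p = q = 1/2$:

```latex
\begin{aligned}
\Pr\{X = D\} &= \prod_{i,j} \Bigl(\tfrac12\Bigr)\Bigl(\tfrac12\Bigr)^{d_{ij}}
  = 2^{-4} \cdot 2^{-N} = 2^{-8}
  \quad \text{for every } D \in A_+(R, C), \\
g(Z_+) &= 4 \bigl( 2 \ln 2 - 1 \cdot \ln 1 \bigr) = 8 \ln 2,
  \qquad e^{-g(Z_+)} = 2^{-8}, \\
\Pr\{X \in A_+(R, C)\} &= |A_+(R, C)| \cdot 2^{-8} = \frac{3}{256}.
\end{aligned}
```

Here $|A_+(R, C)| = 3$, since a matrix in the set is determined by its entry $d_{11} \in \{0, 1, 2\}$.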

Recall that $\sigma_S(B)$ denotes the sum of the entries of a matrix $B$ indexed by a set $S$. We obtain the following corollary, see [Ba10b] for details.

(6.4) Corollary. Let us fix real numbers $\kappa > 0$ and $0 < \delta < 1$. Then there exists a positive integer $q = q(\kappa, \delta)$ such that the following holds.

Let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be positive integer vectors such that $r_1 + \ldots + r_m = c_1 + \ldots + c_n = N$,

$$\frac{\delta N}{m} \leq r_i \leq \frac{N}{\delta m} \quad \text{for } i = 1, \ldots, m,$$

$$\frac{\delta N}{n} \leq c_j \leq \frac{N}{\delta n} \quad \text{for } j = 1, \ldots, n$$

and

$$\frac{N}{mn} \geq \delta.$$

Suppose that $n \geq m > q$ and let $S \subset \{(i, j) : i = 1, \ldots, m;\ j = 1, \ldots, n\}$ be a set such that $|S| \geq \delta mn$. Let $Z_+ \in P_+(R, C)$ be the maximum entropy matrix and let

$$\epsilon = \frac{\delta \ln n}{\sqrt{m}}.$$

If $\epsilon \leq 1$ then

$$\Pr\bigl\{D \in A_+(R, C) : (1 - \epsilon)\, \sigma_S(Z_+) \leq \sigma_S(D) \leq (1 + \epsilon)\, \sigma_S(Z_+)\bigr\} \geq 1 - n^{-\kappa n}.$$


As is discussed in [BH10a], the ultimate reason why Theorems 6.1 and 6.3 hold true is that

the matrix $X$ of independent Bernoulli random variables such that $\mathbf{E}\, X = Z_0$ is the random matrix with the maximum possible entropy among all random $m \times n$ matrices with 0-1 entries and the expectation in the affine subspace of the matrices with row sums $R$ and column sums $C$

and

the matrix $X$ of independent geometric random variables such that $\mathbf{E}\, X = Z_+$ is the random matrix with the maximum possible entropy among all random $m \times n$ matrices with non-negative integer entries and the expectation in the affine subspace of the matrices with row sums $R$ and column sums $C$.

Thus Theorems 6.1 and 6.3 can be considered as an illustration of Good's thesis [Go63] that the "null hypothesis" for an unknown probability distribution from a given class should be the hypothesis that the unknown distribution is, in fact, the distribution of the maximum entropy in the given class.

(6.5) Sketch of proof of Theorem 6.1. Let $Z_0 = (z_{ij})$ be the maximum entropy matrix as in Lemma 5.2. Let us choose $D \in A_0(R, C)$, $D = (d_{ij})$. Using (5.2.1), we get

$$\Pr\{X = D\} = \prod_{i,j} z_{ij}^{d_{ij}} (1 - z_{ij})^{1 - d_{ij}} = \prod_{i,j} \frac{e^{(\lambda_i + \mu_j) d_{ij}}}{1 + e^{\lambda_i + \mu_j}} = \exp\Bigl\{\sum_{i=1}^m \lambda_i r_i + \sum_{j=1}^n \mu_j c_j\Bigr\} \prod_{i,j} \frac{1}{1 + e^{\lambda_i + \mu_j}} = e^{-h(Z_0)},$$

which proves Part (1).

To prove Part (2), we use Part (1), Theorem 2.1 and Lemma 5.2. We have

$$\Pr\{X \in A_0(R, C)\} = |A_0(R, C)|\, e^{-h(Z_0)} \geq (mn)^{-\gamma(m+n)} \alpha_0(R, C)\, e^{-h(Z_0)} = (mn)^{-\gamma(m+n)}$$

for some absolute constant $\gamma > 0$. □

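(6.6) Sketch of proof of Theorem 6.3. Let $Z_+ = (z_{ij})$ be the maximum entropy matrix as in Lemma 5.3. Let us choose $D \in A_+(R, C)$, $D = (d_{ij})$. Using (5.3.1), we get

$$\Pr\{X = D\} = \prod_{i,j} \frac{1}{1 + z_{ij}} \Bigl(\frac{z_{ij}}{1 + z_{ij}}\Bigr)^{d_{ij}} = \prod_{i,j} \bigl(1 - e^{-\lambda_i - \mu_j}\bigr) e^{-(\lambda_i + \mu_j) d_{ij}} = \exp\Bigl\{-\sum_{i=1}^m \lambda_i r_i - \sum_{j=1}^n \mu_j c_j\Bigr\} \prod_{i,j} \bigl(1 - e^{-\lambda_i - \mu_j}\bigr) = e^{-g(Z_+)},$$

which proves Part (1).

To prove Part (2), we use Part (1), Theorem 3.1 and Lemma 5.3. We have

$$\Pr\{X \in A_+(R, C)\} = |A_+(R, C)|\, e^{-g(Z_+)} \geq N^{-\gamma(m+n)} \alpha_+(R, C)\, e^{-g(Z_+)} = N^{-\gamma(m+n)}$$

for some absolute constant $\gamma > 0$. □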

(6.7) Open questions. Theorems 6.1 and 6.3 show that a random matrix $D \in A_0(R, C)$, respectively $D \in A_+(R, C)$, in many respects behaves like a matrix of independent Bernoulli, respectively geometric, random variables whose expectation is the maximum entropy matrix $Z_0$, respectively $Z_+$. One can ask whether individual entries $d_{ij}$ of $D$ behave asymptotically as Bernoulli, respectively geometric, random variables with expectations $z_{ij}$ as the size of the matrices grows. In the simplest situation we ask the following

(6.7.1) Question. Let $(R, C)$ be margins and let $(R_k, C_k)$ be margins obtained from $(R, C)$ by cloning as in Section 5.5. Is it true that as $k$ grows, the entry $d_{11}$ of a random matrix $D \in A_0(R_k, C_k)$, respectively $D \in A_+(R_k, C_k)$, converges in distribution to the Bernoulli, respectively geometric, random variable with expectation $z_{11}$, where $Z_0 = (z_{ij})$, respectively $Z_+ = (z_{ij})$, is the maximum entropy matrix of margins $(R, C)$?

Some entries of the maximum entropy matrix $Z_+$ may turn out to be surprisingly large, even for reasonable-looking margins. In [Ba10b], the following example is considered. Suppose that $m = n$ and let $R_n = C_n = (3n, n, \ldots, n)$. It turns out that the entry $z_{11}$ of the maximum entropy matrix $Z_+$ is linear in $n$, namely $z_{11} > 0.58n$, while all other entries remain bounded by a constant. One can ask whether the $d_{11}$ entry of a random matrix $D \in A_+(R_n, C_n)$ is indeed large, as the value of $z_{11}$ suggests.

(6.7.2) Question. Let $(R_n, C_n)$ be margins as above. Is it true that as $n$ grows, one has $\mathbf{E}\, d_{11} = \Omega(n)$ for a random matrix $D \in A_+(R_n, C_n)$?

Curiously, the entry $z_{11}$ becomes bounded by a constant if $3n$ is replaced by $2n$.
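
The behaviour of $z_{11}$ in this example can be explored numerically. The Python sketch below is an illustration under stated assumptions, not the computation of [Ba10b]. It assumes that the entries of $Z_+$ have the form $z_{ij} = x_i y_j / (1 - x_i y_j)$, which is the form used in the proof sketch of Theorem 6.3 with $x_i = e^{-\lambda_i}$ and $y_j = e^{-\mu_j}$, and that, by the symmetry of the margins, $x = y = (s, t, \ldots, t)$. Writing $a = z_{11}$, $b = z_{12}$ and $c = z_{22}$, the two margin equations are $a + (n-1) b = \mathrm{lam} \cdot n$ and $b + (n-1) c = n$, and they are solved by bisection in $t$. The function name `z11` and the parameter `lam` (the multiple of $n$ in the first margin) are hypothetical.

```python
import math


def z11(lam, n):
    """Entry z_11 of Z_+ for margins R = C = (lam*n, n, ..., n), symmetric ansatz."""
    # At t_max the entry c = t^2 / (1 - t^2) reaches n / (n - 1), which forces b = 0.
    t_max = math.sqrt(n / (2.0 * n - 1.0))
    lo, hi = 0.0, t_max
    a = float("nan")
    for _ in range(200):
        t = (lo + hi) / 2
        c = t * t / (1 - t * t)
        b = n - (n - 1) * c           # second margin equation: b + (n - 1) c = n
        s = b / ((1 + b) * t)         # from s * t = b / (1 + b)
        if s >= 1:                    # infeasible: z_11 = s^2 / (1 - s^2) is not finite
            lo = t
            continue
        a = s * s / (1 - s * s)
        if a + (n - 1) * b > lam * n:  # first margin is overshot, so increase t
            lo = t
        else:
            hi = t
    return a


for n in (200, 2000):
    print(n, z11(3, n) / n, z11(2, n))
```

The bisection relies on the fact that $b$, $s$ and hence $a + (n-1) b$ all decrease as $t$ increases. Comparing the printed ratio $z_{11}/n$ for the first margin $3n$ with the printed value of $z_{11}$ for the first margin $2n$ illustrates the contrast described above.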

7. Asymptotic formulas for the number of 
matrices with prescribed row and column sums 

In this section, we discuss asymptotically exact estimates for $|A_0(R, C)|$ and $|A_+(R, C)|$.

(7.1) An asymptotic formula for $|A_0(R, C)|$. Theorem 6.1 suggests the following way to estimate the number $|A_0(R, C)|$ of 0-1 matrices with row sums $R$ and column sums $C$. Let us consider the matrix $X$ of independent Bernoulli random variables as in Theorem 6.1 and let $Y$ be the random $(m + n)$-vector obtained by computing the row and column sums of $X$. Then, by Theorem 6.1, we have

(7.1.1) $$|A_0(R, C)| = e^{h(Z_0)} \Pr\{X \in A_0(R, C)\} = e^{h(Z_0)} \Pr\{Y = (R, C)\}.$$

Now, random $(m + n)$-vector $Y$ is obtained as a sum of $mn$ independent random vectors and $\mathbf{E}\, Y = (R, C)$, so it is not unreasonable to assume that $\Pr\{Y = (R, C)\}$ can be estimated via some version of the Local Central Limit Theorem. In [BH10b] we show that this is indeed the case provided one employs the Edgeworth correction factor in the Central Limit Theorem.

We introduce the necessary objects to state the asymptotic formula for the number of 0-1 matrices with row sums $R$ and column sums $C$.

Let $Z_0 = (z_{ij})$ be the maximum entropy matrix as in Lemma 5.2. We assume that $0 < z_{ij} < 1$ for all $i$ and $j$. Let us consider the quadratic form $q_0 : \mathbb{R}^{m+n} \to \mathbb{R}$ defined by

$$q_0(s, t) = \frac{1}{2} \sum_{\substack{1 \leq i \leq m \\ 1 \leq j \leq n}} \bigl(z_{ij} - z_{ij}^2\bigr) (s_i + t_j)^2$$

for $s = (s_1, \ldots, s_m)$ and $t = (t_1, \ldots, t_n)$.

Quadratic form $q_0$ is positive semidefinite with the kernel spanned by vector

$$u = \bigl(\underbrace{1, \ldots, 1}_{m \text{ times}};\ \underbrace{-1, \ldots, -1}_{n \text{ times}}\bigr).$$

Let $H = u^{\perp}$ be the hyperplane in $\mathbb{R}^{m+n}$ defined by the equation

(7.1.2) $$s_1 + \ldots + s_m = t_1 + \ldots + t_n.$$

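Then the restriction $q_0|H$ of $q_0$ onto $H$ is a positive definite quadratic form and we define its determinant $\det q_0|H$ as the product of the non-zero eigenvalues of $q_0$. We consider the Gaussian probability measure on $H$ with the density proportional to $e^{-q_0}$ and define random variables $\phi_0, \psi_0 : H \to \mathbb{R}$ by

$$\phi_0(s, t) = \frac{1}{6} \sum_{\substack{1 \leq i \leq m \\ 1 \leq j \leq n}} z_{ij} (1 - z_{ij}) (2 z_{ij} - 1) (s_i + t_j)^3 \quad \text{and}$$

$$\psi_0(s, t) = \frac{1}{24} \sum_{\substack{1 \leq i \leq m \\ 1 \leq j \leq n}} z_{ij} (1 - z_{ij}) \bigl(6 z_{ij}^2 - 6 z_{ij} + 1\bigr) (s_i + t_j)^4$$

for $(s, t) = (s_1, \ldots, s_m; t_1, \ldots, t_n)$.

We let

$$\mu_0 = \mathbf{E}\, \phi_0^2 \quad \text{and} \quad \nu_0 = \mathbf{E}\, \psi_0.$$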

(7.2) Theorem. Let us fix $0 < \delta < 1/2$, let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be margins such that $m \geq \delta n$ and $n \geq \delta m$. Let $Z_0 = (z_{ij})$ be the maximum entropy matrix as in Lemma 5.2 and suppose that $\delta \leq z_{ij} \leq 1 - \delta$ for all $i$ and $j$.

Let the quadratic form $q_0$ and values $\mu_0$ and $\nu_0$ be as defined in Section 7.1. Then the number

(7.2.1) $$\frac{e^{h(Z_0)} \sqrt{m + n}}{(4\pi)^{(m+n-1)/2} \sqrt{\det q_0|H}} \exp\Bigl\{-\frac{\mu_0}{2} + \nu_0\Bigr\}$$

approximates the number $|A_0(R, C)|$ of 0-1 matrices with row sums $R$ and column sums $C$ within a relative error which approaches 0 as $m, n \to +\infty$. More precisely, for any $0 < \epsilon \leq 1/2$, the value of (7.2.1) approximates $|A_0(R, C)|$ within relative error $\epsilon$ provided

$$m, n \geq \Bigl(\frac{1}{\epsilon}\Bigr)^{\gamma(\delta)}$$

for some $\gamma(\delta) > 0$.

Some remarks are in order.

All the ingredients of formula (7.2.1) are efficiently computable, in time polynomial in $m + n$, see [BH10b] for details. If all row sums are equal then we have $z_{ij} = c_j/m$ by symmetry and if all column sums are equal, we have $z_{ij} = r_i/n$. In particular, if all row sums are equal and if all column sums are equal, we obtain the asymptotic formula of [C+08].

Let us consider formula (7.1.1). If, in the spirit of the Local Central Limit Theorem, we approximated $\Pr\{Y = (R, C)\}$ by $\Pr\{Y^* \in (R, C) + U\}$, where $Y^*$ is the $(m + n - 1)$-dimensional random Gaussian vector whose expectation and covariance matrix match those of $Y$ and where $U$ is the set of points on the hyperplane $H$ that are closer to $(R, C)$ than to any other integer vector in $H$, we would have obtained the first part

$$\frac{e^{h(Z_0)} \sqrt{m + n}}{(4\pi)^{(m+n-1)/2} \sqrt{\det q_0|H}}$$

of formula (7.2.1). Under the conditions of Theorem 7.2 we have

$$c_1(\delta) \leq \exp\Bigl\{-\frac{\mu_0}{2} + \nu_0\Bigr\} \leq c_2(\delta)$$

for some constants $c_1(\delta), c_2(\delta) > 0$ and this factor represents the Edgeworth correction to the Central Limit Theorem. We note that the constraints $\delta \leq z_{ij} \leq 1 - \delta$ are, generally speaking, unavoidable. If the entries $z_{ij}$ of the maximum entropy matrix are uniformly small, then the distribution of the random vector $Y$ of row and column sums of the random Bernoulli matrix $X$ is no longer approximately Gaussian but approximately Poisson and formula (7.2.1) does not give correct asymptotics. The sparse case of small row and column sums is investigated in [G+06].

More generally, to have some analytic formula approximating $|A_0(R, C)|$ we need certain regularity conditions on $(R, C)$, since the number $|A_0(R, C)|$ becomes volatile when the margins $(R, C)$ approach the boundary of the Gale-Ryser conditions, cf. [JSM92]. By requiring that the entries of the maximum entropy matrix $Z_0$ are separated from both 0 and 1, we ensure that the margins $(R, C)$ remain sufficiently inside the polyhedron defined by the Gale-Ryser inequalities and the number of 0-1 matrices with row sums $R$ and column sums $C$ changes sufficiently smoothly when $R$ and $C$ change.

(7.3) An asymptotic formula for $|A_+(R, C)|$. As in Theorem 6.3, let $X$ be the matrix of independent geometric random variables such that $\mathbf{E}\, X = Z_+$, where $Z_+$ is the maximum entropy matrix. Let $Y$ be the random $(m + n)$-vector obtained by computing the row and column sums of $X$. Then, by Theorem 6.3, we have

(7.3.1) $$|A_+(R, C)| = e^{g(Z_+)} \Pr\{X \in A_+(R, C)\} = e^{g(Z_+)} \Pr\{Y = (R, C)\}.$$

In [BH09] we show how to estimate the probability that $Y = (R, C)$ using the Local Central Limit Theorem with the Edgeworth correction.
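
Let $Z_+ = (z_{ij})$ be the maximum entropy matrix as in Lemma 5.3. Let us consider the quadratic form $q_+ : \mathbb{R}^{m+n} \to \mathbb{R}$ defined by

$$q_+(s, t) = \frac{1}{2} \sum_{\substack{1 \leq i \leq m \\ 1 \leq j \leq n}} \bigl(z_{ij} + z_{ij}^2\bigr) (s_i + t_j)^2$$

for $s = (s_1, \ldots, s_m)$ and $t = (t_1, \ldots, t_n)$.

Let $H \subset \mathbb{R}^{m+n}$ be the hyperplane defined by (7.1.2). The restriction $q_+|H$ of $q_+$ onto $H$ is a positive definite quadratic form and we define its determinant $\det q_+|H$ as the product of the non-zero eigenvalues of $q_+$. We consider the Gaussian probability measure on $H$ with the density proportional to $e^{-q_+}$ and define random variables $\phi_+, \psi_+ : H \to \mathbb{R}$ by

$$\phi_+(s, t) = \frac{1}{6} \sum_{\substack{1 \leq i \leq m \\ 1 \leq j \leq n}} z_{ij} (1 + z_{ij}) (2 z_{ij} + 1) (s_i + t_j)^3 \quad \text{and}$$

$$\psi_+(s, t) = \frac{1}{24} \sum_{\substack{1 \leq i \leq m \\ 1 \leq j \leq n}} z_{ij} (1 + z_{ij}) \bigl(6 z_{ij}^2 + 6 z_{ij} + 1\bigr) (s_i + t_j)^4$$

for $(s, t) = (s_1, \ldots, s_m; t_1, \ldots, t_n)$.

We let

$$\mu_+ = \mathbf{E}\, \phi_+^2 \quad \text{and} \quad \nu_+ = \mathbf{E}\, \psi_+.$$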

(7.4) Theorem. Let us fix $0 < \delta < 1$, let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be margins such that $m \geq \delta n$ and $n \geq \delta m$. Let $Z_+ = (z_{ij})$ be the maximum entropy matrix as in Lemma 5.3. Suppose that

$$\delta \tau \leq z_{ij} \leq \tau \quad \text{for all } i, j$$

for some $\tau \geq \delta$.

Let the quadratic form $q_+$ and values $\mu_+$ and $\nu_+$ be as defined in Section 7.3. Then the number

(7.4.1) $$\frac{e^{g(Z_+)} \sqrt{m + n}}{(4\pi)^{(m+n-1)/2} \sqrt{\det q_+|H}} \exp\Bigl\{-\frac{\mu_+}{2} + \nu_+\Bigr\}$$

approximates the number $|A_+(R, C)|$ of non-negative integer matrices with row sums $R$ and column sums $C$ within a relative error which approaches 0 as $m, n \to +\infty$. More precisely, for any $0 < \epsilon \leq 1/2$, the value of (7.4.1) approximates $|A_+(R, C)|$ within relative error $\epsilon$ provided

$$m, n \geq \Bigl(\frac{1}{\epsilon}\Bigr)^{\gamma(\delta)}$$

for some $\gamma(\delta) > 0$.

All the ingredients of formula (7.4.1) are efficiently computable, in time polynomial in $m + n$, see [BH09] for details. If all row sums are equal then we have $z_{ij} = c_j/m$ by symmetry and if all column sums are equal, we have $z_{ij} = r_i/n$. In particular, if all row sums are equal and if all column sums are equal, we obtain the asymptotic formula of [C+07]. The term

$$\frac{e^{g(Z_+)} \sqrt{m + n}}{(4\pi)^{(m+n-1)/2} \sqrt{\det q_+|H}}$$

corresponds to the Gaussian approximation for the distribution of the random vector $Y$ in (7.3.1), while

$$\exp\Bigl\{-\frac{\mu_+}{2} + \nu_+\Bigr\}$$

is the Edgeworth correction factor.

While the requirement that the entries of the maximum entropy matrix $Z_+$ are separated from 0 is unavoidable (if $z_{ij}$ are small, the coordinates of $Y$ are asymptotically Poisson, not Gaussian, see [GM08] for the analysis of the sparse case), it is not clear whether the requirement that all $z_{ij}$ are within a constant factor of each other is indeed needed. It could be that around certain margins $(R, C)$ the number $|A_+(R, C)|$ experiences sudden jumps, as the margins change, which precludes the existence of an analytic expression similar to (7.4.1) for $|A_+(R, C)|$. A candidate for such an abnormal behavior is supplied by the margins discussed in Section 6.7. Namely, if $m = n$ and $R = C = (\lambda n, n, \ldots, n)$ then for $\lambda = 2$ all the entries of the maximum entropy matrix $Z_+$ are $O(1)$, while for $\lambda = 3$ the first entry $z_{11}$ grows linearly in $n$. Hence for some particular $\lambda$ between 2 and 3 a certain "phase transition" occurs: the entry $z_{11}$ jumps from $O(1)$ to $\Omega(n)$. It would be interesting to find out if there is indeed a sharp change in $|A_+(R, C)|$ when $\lambda$ changes from 2 to 3.

8. Concluding remarks

The method of Sections 6 and 7 has been applied to some related problems, such as counting higher-order "tensors" with 0-1 or non-negative integer entries and prescribed sums along coordinate hyperplanes [BH10a] and counting graphs with prescribed degrees of vertices [BH10b], which corresponds to counting symmetric 0-1 matrices with zero trace and prescribed row (column) sums.

In general, the problem can be described as follows: we have a polytope $P \subset \mathbb{R}^d$ defined as the intersection of the non-negative orthant with an affine subspace $\mathcal{A}$ in $\mathbb{R}^d$ and we construct a $d$-vector $X$ of independent Bernoulli (in the 0-1 case) or geometric (in the non-negative integer case) random variables, so that the expectation of $X$ lies in $\mathcal{A}$ and the distribution of $X$ is uniform, when restricted onto the set of 0-1 or integer points in $P$. Random vector $X$ is determined by its expectation $\mathbf{E}\, X = z$ and $z$ is found by solving a convex optimization problem on $P$. Since vector $X$ conditioned on the set of 0-1 or non-negative integer vectors in $P$ is uniform, the number of 0-1 or non-negative integer points in $P$ is expressed in terms of the probability that $X$ lies in $\mathcal{A}$. Assuming that the affine subspace $\mathcal{A}$ is defined by a system $Ax = b$ of linear equations, where $A$ is a $k \times d$ matrix of rank $k < d$, we define a $k$-vector $Y = AX$ of random variables and estimate the probability that $Y = b$ by using a Local Central Limit Theorem type argument. Here we essentially use that $\mathbf{E}\, Y = b$, since the expectation of $X$ lies in $\mathcal{A}$.

Not surprisingly, the argument works the easiest when the codimension $k$ of the affine subspace (and hence the dimension of vector $Y$) is small. In particular, counting higher-order "tensors" is easier than counting matrices: the need for the Edgeworth correction factor, for example, disappears as the vector $Y$ turns out to be closer in distribution to a Gaussian vector, see [BH10a]. Once a Gaussian or almost Gaussian estimate for the probability $\Pr\{Y = b\}$ is established, one can claim a certain concentration of a random 0-1 or integer point in $P$ around $z = \mathbf{E}\, X$.



References 



[Ba07] A. Barvinok, Brunn-Minkowski inequalities for contingency tables and integer flows, Advances in Mathematics 211 (2007), 105-122.

[Ba09] A. Barvinok, Asymptotic estimates for the number of contingency tables, integer flows, and volumes of transportation polytopes, International Mathematics Research Notices 2009 (2009), 348-385.

[Ba10a] A. Barvinok, On the number of matrices and a random matrix with prescribed row and column sums and 0-1 entries, Advances in Mathematics 224 (2010), 316-339.

[Ba10b] A. Barvinok, What does a random contingency table look like?, Combinatorics, Probability and Computing 19 (2010), 517-539.

[BH09] A. Barvinok and J. A. Hartigan, An asymptotic formula for the number of non-negative integer matrices with prescribed row and column sums, preprint arXiv:0910.2477 (2009).

[BH10a] A. Barvinok and J. A. Hartigan, Maximum entropy Gaussian approximation for the number of integer points and volumes of polytopes, Advances in Applied Mathematics 45 (2010), 252-289.

[BH10b] A. Barvinok and J. A. Hartigan, The number of graphs and a random graph with a given degree sequence, preprint arXiv:1003.0356 (2010).

[Be74] E. Bender, The asymptotic number of non-negative integer matrices with given row 
and column sums, Discrete Math. 10 (1974), 217-223. 

[BR91] R.A. Brualdi and H.J. Ryser, Combinatorial Matrix Theory, Encyclopedia of Mathe- 
matics and its Applications, 39, Cambridge University Press, Cambridge, 1991. 

[C+08] E.R. Canfield, C. Greenhill, and B.D. McKay, Asymptotic enumeration of dense 0- 
1 matrices with specified line sums, Journal of Combinatorial Theory. Series A 115 
(2008), 32-66. 

[C+07] E.R. Canfield and B.D. McKay, Asymptotic enumeration of contingency tables with constant margins, preprint arXiv math.CO/0703600, Combinatorica, to appear (2007).

[Ga02] R.J. Gardner, The Brunn-Minkowski inequality, Bull. Amer. Math. Soc. (N.S.) 39 (2002), 355-405.

[Go63] I.J. Good, Maximum entropy for hypothesis formulation, especially for multidimen- 
sional contingency tables, Ann. Math. Statist. 34 (1963), 911-934. 

[Go76] I.J. Good, On the application of symmetric Dirichlet distributions and their mixtures 
to contingency tables, Ann. Statist. 4 (1976), 1159-1189. 

[GC77] I.J. Good and J.F. Crook, The enumeration of arrays and a generalization related to 
contingency tables, Discrete Mathematics 19 (1977), 23-45. 

[G+06] C. Greenhill, B.D. McKay, and X. Wang, Asymptotic enumeration of sparse 0-1 ma- 
trices with irregular row and column sums, Journal of Combinatorial Theory. Series A 
113 (2006), 291-324. 

[GM08] C. Greenhill and B.D. McKay, Asymptotic enumeration of sparse nonnegative integer matrices with specified row and column sums, Advances in Applied Mathematics 41 (2008), 459-481.

[JSM92] M. Jerrum, A. Sinclair and B. McKay, When is a graphical sequence stable?, Random Graphs, Vol. 2 (Poznań, 1989), Wiley-Intersci. Publ., Wiley, New York, 1992, pp. 101-115.

[Kh57] A.I. Khinchin, Mathematical Foundations of Information Theory, Dover Publications 
Inc., New York, N. Y., 1957. 

[LW01] J.H. van Lint and R.M. Wilson, A Course in Combinatorics. Second edition, Cam- 
bridge University Press, Cambridge, 2001. 

[NN94] Y. Nesterov and A. Nemirovskii, Interior-Point Polynomial Algorithms in Convex Programming, SIAM Studies in Applied Mathematics, 13, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1994.

Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1043, 
USA 

E-mail address: barvinok@umich.edu 






